Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-47535: Actually save calibrated catalog in ReprocessVisitImage. #81

Merged
merged 4 commits into from
Dec 4, 2024

Conversation

TallJimbo
Copy link
Member

No description provided.

@@ -491,7 +491,11 @@ def run(
result.exposure.info.setSummaryStats(
self.compute_summary_stats.run(result.exposure, result.sources_footprints, background)
)
self._apply_photo_calib(result.exposure, result.sources_footprints, photo_calib)
result.sources_footprints = self._apply_photo_calib(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, shoot! Thanks for catching this. I see I didn't have a test that the _flux columns were in the output in test_reprocess_visit_image.py. Can you please add that? Either a test of existence, and/or a check that the brightest PsfFlux_flux value matches self.photo_calib*self.fluxes[-1] (which is the brightest in the input catalog).

Also, while you're at it, please fix _apply_photo_calib() to not re-use sources_footprints, since that might be how I managed to confuse myself. photo_calib.calibrateCatalog not being in-place is different from many of our other catalog operations, but there's a good reason it can't be in-place. But I still screwed it up.

Copy link
Member Author

@TallJimbo TallJimbo Nov 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In thinking about adding at test to drp_tasks (note that this is now already tested in ci_hsc and ci_imsim) I thought, "oh, I'll test that the schema init-output is consistent with the output catalog." And then I realized that the schema is not going to be consistent, because it doesn't have the flux fields, and we don't have a good way to make it consistent.

At this point I'm inclined to just remove the sources_schema connection, as I don't anticipate anything consuming it, but I'd be interested in your thoughts on that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, nevermind, I bet we do have an easy way to fix it: since the init-output is a catalog with no rows, I think we can just call calibrateCatalog with a dummy PhotoCalib in __init__.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, given that the tests in drp_pipe are all mocks-based, adding a test there is considerably harder than upgrading the ones in ci_hsc and ci_hsc, so that's what I've done - those now check that the schema and the output catalog are consistent, and the existing test on LocalPhotoCalib checks that the flux columns are present. I'm starting a new Jenkins run now to make sure that all works.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good catch: I'd forgotten about that schema; we probably want a check in this package's tests that the sources_schema produced by __init__ matches the schema of the output sources catalog, since it's not something that needs ci_* to check.

I added that schema connection because when I was working on reprocessVisitImage, I had to add an output schema to finalizeCharacterization to take as an input here, even though Eli had had a similar thought of "this is the final point in the pipeline, nobody's ever going to need the schema".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want us to push tests into ci_* that could really be done in this package instead. Those packages take far too long to run, and I think we already have too much testing dependency on them.

Copy link
Member Author

@TallJimbo TallJimbo Dec 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the same thought, but the tests in drp_tasks don't actually call run; that's completely mocked out. So there's no source catalog here to check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What? Yes they do! You can put that test at the end of this one!

https://github.com/lsst/drp_tasks/blob/main/tests/test_reprocess_visit_image.py#L186

Copy link
Member Author

@TallJimbo TallJimbo Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I saw that the test case was called CalibrateImageTasksTests and didn't look deeper, but now that I have I think that name is a copy-paste bug. Anyhow, yes, I'll put the new test there (and rename the test case).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, tests are in place in drp_tasks.

@TallJimbo TallJimbo force-pushed the tickets/DM-47535 branch 4 times, most recently from 376f658 to a8a448f Compare December 2, 2024 14:53
Copy link
Contributor

@parejkoj parejkoj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the added test. I think you could remove the equivalent tests from ci_imsim/ci_hsc, but it's up to youj.

@parejkoj
Copy link
Contributor

parejkoj commented Dec 3, 2024

Oh, and thanks for fixing my copypasta test name!

This adds a new `self.schema` attribute which is actually a Schema,
not the single-row SourceCatalog required for an init-output like
`self.sources_schema`, for use in most methods and subtasks, since
those don't want the flux fields to be present.
@TallJimbo TallJimbo merged commit 752c614 into main Dec 4, 2024
3 checks passed
@TallJimbo TallJimbo deleted the tickets/DM-47535 branch December 4, 2024 14:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants